Abstract: This paper presents a comparison between different expansion functions for a specific structure
of neural network, the functional link artificial neural network (FLANN). This technique has been
employed for classification tasks in data mining. There are a few studies that have used this tool for
solving classification problems, and in most cases the trigonometric expansion function is the one
used. In the present research, we propose a hybrid FLANN (HFLANN) model in which the optimization
process is performed using three known population based techniques: genetic algorithms, particle
swarm and differential evolution. This model will be empirically compared using different expansion
functions, and the best function will be selected.
I. Introduction

Classification is a very important task in data mining. A lot of research ([1], [2], [3]) has focused on the
field over the last two decades. Data mining is a knowledge discovery process from large databases. The
extracted knowledge will be used by a human user for supporting a decision, which is the ultimate goal of data
mining. Therefore, classification for decision support is our aim in this study. Various classification models
have been used in this regard. One study [4] employed linear/quadratic discriminant techniques for solving
classification problems. Another procedure has been applied using decision trees ([5], [6]). In the same
context, Duda et al. [7] have proposed a discriminant analysis based on Bayesian decision theory.
Nevertheless, these traditional statistical models are built mainly on various linear assumptions that must
necessarily be satisfied. Otherwise, we cannot apply these techniques to classification tasks. To overcome this
disadvantage, artificial intelligence tools have emerged to solve data mining classification problems. For
this purpose, genetic algorithm models were used [8]. In more recent research, Zhang ([9], [10]) introduced
the neural network technique as a powerful classification tool. In these studies, he showed that the
neural network is a promising alternative compared to various conventional classification techniques.

In the more recent literature, a specific structure of neural network has been employed for the classification
task of data mining: the functional link artificial neural network (FLANN). There are a few studies ([11],
[13]) that have used this tool for solving classification problems.



In the present research, we propose a hybrid FLANN (HFLANN) model based on three population based
metaheuristic optimization tools: genetic algorithms (GAs), particle swarm optimization (PSO)
and differential evolution. This model will be compared using different expansion functions, and the best one
will be selected.
II. Concepts and Definition

A. Population Based Algorithms 

Population based algorithms are classed as computational intelligence techniques and represent a class of
robust optimization methods. They work on a population of candidate solutions at the same
time and are inspired by natural evolution.

Many population based algorithms are presented in the literature, such as evolutionary programming [14],
evolution strategy [15], genetic algorithms [16], genetic programming [17], ant colony [18], particle swarm
[19] and differential evolution [20]. These algorithms differ in their selection, offspring generation and
replacement mechanisms. Genetic algorithms, particle swarm and differential evolution represent the
most popular ones.
1) Genetic algorithms

Genetic algorithms (GAs) are defined as a search technique inspired by Darwinian theory. The
idea is based on the theory of natural selection. We assume that there is a population composed of individuals
with different characteristics. The stronger ones will be able to survive, and they pass their characteristics
to their offspring.

The total process is described as follows:

1- Randomly generate an initial population;

2- Evaluate this population using the fitness function;

3- Apply genetic operators such as selection, crossover and mutation;

4- Repeat the "evaluation, crossover, mutation" cycle until the stopping criterion fixed a priori is reached.
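
The four steps above can be sketched as a short loop. This is only an illustration under stated assumptions, not the implementation used in this paper: the function name `genetic_search`, the real-valued encoding, the binary tournament, the one-point crossover, the Gaussian mutation, the elitism and all numeric defaults are hypothetical choices, and lower fitness is taken to be better.

```python
import random

def genetic_search(fitness, dim, pop_size=20, generations=50, seed=0):
    """Minimize `fitness` over real vectors with a basic generational GA."""
    rng = random.Random(seed)
    # Step 1: randomly generate an initial population.
    pop = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(generations):
        # Step 2: evaluate the population (lower fitness is better here).
        scored = sorted(pop, key=fitness)
        children = [scored[0][:]]  # elitism: keep the best individual unchanged
        while len(children) < pop_size:
            # Step 3a: binary tournament selection of two parents.
            p1 = min(rng.sample(scored, 2), key=fitness)
            p2 = min(rng.sample(scored, 2), key=fitness)
            # Step 3b: one-point crossover.
            cut = rng.randrange(1, dim) if dim > 1 else 0
            child = p1[:cut] + p2[cut:]
            # Step 3c: mutate one randomly chosen gene with Gaussian noise.
            j = rng.randrange(dim)
            child[j] += rng.gauss(0.0, 0.1)
            children.append(child)
        # Step 4: repeat until the generation budget (stopping criterion) is spent.
        pop = children
    return min(pop, key=fitness)
```

For example, `genetic_search(lambda v: sum(x * x for x in v), 3)` searches for a vector close to the origin. Because the best individual is copied unchanged into each new generation, the best fitness can never get worse from one generation to the next.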
2) Particle swarm

Presented in 1995 by Kennedy and Eberhart [19], particle swarm optimization (PSO) represents one of
the best known population based approaches, where particles change their positions with time. These
particles fly around in a multidimensional search space, and each particle adjusts its position according to
its own experience and the experience of its neighbors, making use of the best position encountered by
itself and its neighbors. The direction of a particle is defined by the set of its neighbors and its
corresponding history of experience.

An individual particle i is composed of three vectors:
- Its position in the V-dimensional search space: $X_i = (X_{i1}, X_{i2}, \ldots, X_{iV})$
- The best position that it has individually found: $P_i = (P_{i1}, P_{i2}, \ldots, P_{iV})$
- Its velocity: $V_i = (V_{i1}, V_{i2}, \ldots, V_{iV})$
Particles were originally initialized in a uniform random manner throughout the search space; velocity is
also randomly initialized. 

These particles then move throughout the search space by a fairly simple set of update equations. The 
algorithm updates the entire swarm at each time step by updating the velocity and position of each particle 
in every dimension by the following rules: 

$V_{in} = \chi \, (W \, V_{in} + C \, \varepsilon_1 (P_{in} - X_{in}) + C \, \varepsilon_2 (P_{gn} - X_{in}))$ (1)
$X_{in} = X_{in} + V_{in}$ (2)

Where, in the original equations:

C is a constant with the value of 2.0.

$\varepsilon_1$ and $\varepsilon_2$ are independent random numbers uniquely generated at
every update for each individual dimension (n = 1 to V).

$P_{in}$ is the best position that particle i has individually found.

$P_{gn}$ is the best position found by any neighbor of the particle; with a global neighborhood, it is the
best position found by the whole population of particles.

W: the inertia weight.

$\chi$: the constriction factor.
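
As a minimal sketch of one application of equations (1) and (2) to a single particle: the function name `pso_step`, the list-based representation and the default values `w=0.7` and `chi=1.0` are assumptions made for illustration; only C = 2.0 comes from the text.

```python
import random

def pso_step(x, v, p_best, p_neigh, rng, w=0.7, c=2.0, chi=1.0):
    """Return the new position and velocity lists of one particle."""
    new_x, new_v = [], []
    for n in range(len(x)):
        # Fresh random numbers are drawn for every dimension at every update.
        e1, e2 = rng.random(), rng.random()
        vel = chi * (w * v[n]
                     + c * e1 * (p_best[n] - x[n])      # pull toward own best
                     + c * e2 * (p_neigh[n] - x[n]))    # pull toward neighbor best
        new_v.append(vel)          # equation (1)
        new_x.append(x[n] + vel)   # equation (2)
    return new_x, new_v
```

A particle that already sits on both best positions with zero velocity does not move, which is a quick sanity check on the signs of the two attraction terms.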

3) Differential evolution

Proposed by Storn and Price in 1995 [20], differential evolution represents a new floating-point evolutionary
algorithm using a special kind of differential operator. Easy implementation and negligible parameter
tuning make this algorithm quite popular.

Like any evolutionary algorithm, differential evolution starts with a population. Differential evolution is a
small and simple mathematical model of a big and naturally complex process of evolution. So, it is easy and
efficient.

Firstly, there are five DE strategies (or schemes) that were proposed by R. Storn and K. Price [20]:



• Scheme DE/rand/1:
$\omega = x_1 + F (x_2 - x_3)$ (3)

• Scheme DE/rand/2:
$\omega = x_5 + F (x_1 + x_2 - x_3 - x_4)$ (4)

• Scheme DE/best/1:
$\omega = x_{best} + F (x_1 - x_2)$ (5)

• Scheme DE/best/2:
$\omega = x_{best} + F (x_1 + x_2 - x_3 - x_4)$ (6)

• Scheme DE/rand-to-best/1:
$\omega = x + \lambda (x_{best} - x_1) + F (x_2 - x_3)$ (7)

Later, two more strategies were introduced [21].
We present the trigonometric scheme defined by:



journals@asdf.res. in www.asdfjournals.com Page 49 of 89 



Vol. 1; Iss 1; Year 2013 



Intl. Jrnl. on Human Machine Interaction 



$\omega = (x_1 + x_2 + x_3)/3 + (p_2 - p_1)(x_1 - x_2) + (p_3 - p_2)(x_2 - x_3) + (p_1 - p_3)(x_3 - x_1)$ (8)
$p_i = |f(x_i) / (f(x_1) + f(x_2) + f(x_3))|$, i = 1, 2, 3 (9)

F defines the constriction factor, generally taken equal to 0.5.
x defines the selected element.

$x_1$, $x_2$, $x_3$, $x_4$ and $x_5$ represent randomly generated elements from the population.

Many other schemes can be found in the literature [20].
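
The mutation schemes can be written as small vector functions. The sketch below is illustrative only: the function names are hypothetical, vectors are plain Python lists, DE/rand/1 is taken in its standard form from the DE literature, and the trigonometric scheme assumes that the three fitness values do not sum to zero.

```python
def de_rand_1(x1, x2, x3, F=0.5):
    """Standard DE/rand/1: a random base vector plus one scaled difference."""
    return [a + F * (b - c) for a, b, c in zip(x1, x2, x3)]

def de_best_1(xbest, x1, x2, F=0.5):
    """DE/best/1: the best vector plus one scaled difference."""
    return [g + F * (a - b) for g, a, b in zip(xbest, x1, x2)]

def de_best_2(xbest, x1, x2, x3, x4, F=0.5):
    """DE/best/2: the best vector plus two scaled differences."""
    return [g + F * (a + b - c - d)
            for g, a, b, c, d in zip(xbest, x1, x2, x3, x4)]

def de_trigonometric(x1, x2, x3, f):
    """Trigonometric mutation: centroid plus fitness-weighted differences."""
    total = f(x1) + f(x2) + f(x3)  # assumed nonzero
    p1, p2, p3 = (abs(f(x) / total) for x in (x1, x2, x3))
    return [(a + b + c) / 3.0
            + (p2 - p1) * (a - b)
            + (p3 - p2) * (b - c)
            + (p1 - p3) * (c - a)
            for a, b, c in zip(x1, x2, x3)]
```

When the three fitness values are equal, the weights p1, p2 and p3 coincide and the trigonometric scheme reduces to the centroid of the three vectors.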
B. Functional Link Artificial Neural Networks 



The FLANN architecture was originally proposed by Pao et al. [22]. The basic idea of this model is to apply
an expansion function which increases the input vector dimensionality. The hyper-planes
generated in this way provide greater discrimination capability in the input pattern space. By applying this
expansion, we do not need a hidden layer, which makes the learning algorithm simpler. Thus, compared to the
MLP structure, this model has the advantage of a faster convergence rate and a lower computational cost.

The conventional nonlinear functional expansions which can be employed are of the trigonometric, power series,
Chebyshev polynomial or Chebyshev Legendre polynomial type. The authors of [23] show that the use of the
trigonometric expansion provides better prediction capability of the model. Hence, in the present case, we
aim to validate the best expansion function for the proposed model.



Let each element of the input pattern before expansion be represented as x(i), $1 \le i \le I$, where each
element x(i) is functionally expanded as $Z_n(i)$, $1 \le n \le N$, where N is the number of expanded points
for each input element and I is the total number of features. In this study, we take N = 5.

As presented in figure 1, the expansion of each input pattern is done as follows:
$Z_1(i) = x(i), Z_2(i) = f_1(x(i)), \ldots, Z_5(i) = f_4(x(i))$ (10)

These expanded inputs are then fed to the single layer neural network, and the network is trained to obtain
the desired output. The different expansion functions are described next.

C. Expansion functions

Four expansion functions will be used in this work: the trigonometric, the Chebyshev polynomial, the Legendre
polynomial and the power series. Their characteristics are presented in the four figures.



Trigonometric expansion

$Z_1(i) = x(i)$
$Z_2(i) = \sin(\pi \, x(i))$
$Z_3(i) = \sin(2\pi \, x(i))$
$Z_4(i) = \cos(\pi \, x(i))$
$Z_5(i) = \cos(2\pi \, x(i))$

Figure 1. Trigonometric functional expansion of the first element
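
To make the trigonometric expansion concrete, the sketch below expands every feature into N = 5 values and feeds the result to a single-layer output. The names `trig_expand` and `flann_output`, the bias term and the `tanh` activation are assumptions for illustration; the text does not specify the activation function.

```python
import math

def trig_expand(pattern):
    """Expand each feature x into [x, sin(pi x), sin(2 pi x), cos(pi x), cos(2 pi x)]."""
    expanded = []
    for x in pattern:
        expanded.extend([
            x,
            math.sin(math.pi * x),
            math.sin(2.0 * math.pi * x),
            math.cos(math.pi * x),
            math.cos(2.0 * math.pi * x),
        ])
    return expanded

def flann_output(pattern, weights, bias=0.0):
    """Single-layer output: weighted sum of the expanded inputs, then tanh (assumed)."""
    z = trig_expand(pattern)
    return math.tanh(bias + sum(w * zi for w, zi in zip(weights, z)))
```

A pattern with I features therefore yields 5 * I expanded inputs, so the weight vector of a single output node has 5 * I entries.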
Chebyshev polynomials expansion

$Z_{n+1}(i) = 2 \, x(i) \, Z_n(i) - Z_{n-1}(i)$

$Z_1(i) = x(i)$
$Z_2(i) = 2 (x(i))^2 - 1$
$Z_3(i) = 4 (x(i))^3 - 3 \, x(i)$
$Z_4(i) = 8 (x(i))^4 - 8 (x(i))^2 + 1$
$Z_5(i) = 16 (x(i))^5 - 20 (x(i))^3 + 5 \, x(i)$

Figure 2. Chebyshev polynomials functional expansion of the first element



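Chebyshev Legendre polynomials expansion

$Z_{n+1}(i) = \frac{1}{n+1} \left[ (2n+1) \, x(i) \, Z_n(i) - n \, Z_{n-1}(i) \right]$

$Z_1(i) = x(i)$
$Z_2(i) = \frac{1}{2} \left[ 3 (x(i))^2 - 1 \right]$
$Z_3(i) = \frac{1}{2} \left[ 5 (x(i))^3 - 3 \, x(i) \right]$

The higher-order terms $Z_4(i)$ and $Z_5(i)$ follow from the recurrence.

Figure 3. Chebyshev Legendre polynomials functional expansion of the first element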



Power series expansion

$Z_{n+1}(i) = (x(i))^{n+1}$

$Z_1(i) = x(i)$
$Z_2(i) = (x(i))^2$

Figure 4. Power series functional expansion of the first element
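
The polynomial expansions can be generated directly from their recurrences. The following sketch uses hypothetical function names, takes the zeroth-order term equal to 1 to start both recurrences, and reads the power series expansion as plain powers of x(i), which is an assumption.

```python
def chebyshev_expand(x, n_terms=5):
    """Z1..ZN from Z_{n+1} = 2 x Z_n - Z_{n-1}, with Z0 = 1 and Z1 = x."""
    prev, cur = 1.0, x
    terms = [cur]
    for _ in range(n_terms - 1):
        prev, cur = cur, 2.0 * x * cur - prev
        terms.append(cur)
    return terms

def legendre_expand(x, n_terms=5):
    """Z1..ZN from Z_{n+1} = ((2n+1) x Z_n - n Z_{n-1}) / (n+1), with Z0 = 1."""
    prev, cur = 1.0, x
    terms = [cur]
    for n in range(1, n_terms):
        prev, cur = cur, ((2 * n + 1) * x * cur - n * prev) / (n + 1)
        terms.append(cur)
    return terms

def power_expand(x, n_terms=5):
    """Z_n = x ** n for n = 1..N (assumed reading of the power series)."""
    return [x ** n for n in range(1, n_terms + 1)]
```

Evaluating the closed forms of the Chebyshev terms at x = 0.5 gives 0.5, -0.5, -1.0, -0.5 and 0.5, which is what the recurrence produces.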

III. Hybrid FLANN Description

The proposed hybrid FLANN is based on evolutionary algorithms such as genetic algorithms, particle swarm and
differential evolution.
A. Resampling Technique:

In order to avoid overfitting, we use the (2×5) K-fold cross validation resampling technique. We proceed as
follows:

We divide the initial database into 5 folds (K=5), where each fold contains the same distribution of classes. For
example, if the initial population contains 60% of class 1 and 40% of class 2, then all the resulting K folds must
have the same distribution.
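
A stratified split of this kind can be sketched as follows. The function name `stratified_folds` and the round-robin dealing strategy are illustrative assumptions; the paper does not describe how its folds were built.

```python
import random
from collections import defaultdict

def stratified_folds(labels, k=5, seed=0):
    """Split sample indices into k folds that keep the class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the indices of this class round-robin over the folds.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds
```

With 60 samples of class 1 and 40 of class 2, each of the 5 folds receives 12 samples of class 1 and 8 of class 2. Under the (2×5) scheme, the split would be repeated twice, for instance with two different seeds.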

B. Generation

We begin the process by randomly generating an initial solution. We execute a partial training using differential
evolution in order to improve the initial state.

In order to evaluate each solution, two criteria are used: the mean square error (MSE) and the
misclassification error (MCE) rate. If we have to compare solutions A and B, we apply the following rule: A
is preferred to B if and only if MCE(A) < MCE(B), or MCE(A) = MCE(B) and MSE(A) < MSE(B).
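
The preference rule is a lexicographic comparison on the pair (MCE, MSE). A minimal sketch, where the function name `is_preferred` and the tuple layout are assumptions:

```python
def is_preferred(a, b):
    """Return True if solution a is preferred to b; each is an (mce, mse) tuple."""
    mce_a, mse_a = a
    mce_b, mse_b = b
    return mce_a < mce_b or (mce_a == mce_b and mse_a < mse_b)
```

In Python this is the same ordering as the built-in tuple comparison `a < b`, so a population can be ranked with `sorted(solutions)` when each solution is stored as an `(mce, mse)` pair.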



D. Selection

Many selection methods are defined in the literature, such as the roulette wheel method, the N/2 elitist method and
the tournament selection method. The last method will be used here. The principle is to compare two
solutions, and the best one will be selected.

N/2 elitist is used at the beginning of the process in order to select 50% of the generated solutions.
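
Both selection rules fit in a few lines. In this sketch the function names are hypothetical, `key` returns a value where lower means better, and ties in the tournament go to the first candidate drawn.

```python
import random

def tournament(population, key, rng):
    """Draw two distinct solutions at random and return the better one."""
    a, b = rng.sample(population, 2)
    return a if key(a) <= key(b) else b

def elitist_half(population, key):
    """N/2 elitist selection: keep the best 50% of the population."""
    ranked = sorted(population, key=key)
    return ranked[: len(population) // 2]
```

For instance, `elitist_half([5, 1, 4, 2], key=lambda s: s)` keeps `[1, 2]`.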




E. Crossover

Two parents are selected randomly in order to exchange their information. The following crossovers are applied:

1) Crossover 1 (over input feature): An input feature is chosen randomly and its corresponding
weights are exchanged between the two selected parents.

2) Crossover 2 (over output nodes): An output is chosen randomly and its corresponding weights are exchanged.

3) Crossover 3 (crossover over connection): A connection position is chosen randomly and its
corresponding weight is exchanged between the two parents.

F. Mutation

1) Mutation 1 (over connection)

A connection position is chosen randomly and its corresponding weight is checked. If this
connection is connected, it is disconnected by setting its weight equal to zero. Otherwise,
the connection is connected.

2) Mutation 2 (over one input feature)

An input feature is chosen randomly and its corresponding weights are checked. If this input
feature is connected (at least one of its corresponding weights is different from zero), it is
disconnected by setting all of its weights equal to zero. Otherwise, if this input feature is totally
disconnected, it is connected by generating weights different from zero.

3) Mutation 3 (over two input features)

We proceed as in mutation 2, but simultaneously for two selected features.

4) Mutation 4 (over three input features)

In this mutation, the same principle is used for three input features.

We note that many input feature connections and disconnections can be executed at the same time when
there is a large number of features. This operator helps to remove undesirable features from our
classification process and can improve the final performance.
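
The feature-level mutation can be sketched as a toggle on the block of weights attached to one input feature. The function name `mutate_feature`, the layout (one list of weights per feature) and the range used to draw new nonzero weights are illustrative assumptions.

```python
import random

def mutate_feature(weights, feature, rng):
    """Toggle one input feature; weights[feature] holds that feature's weights."""
    row = weights[feature]
    if any(w != 0.0 for w in row):
        # Connected feature: disconnect it by zeroing all of its weights.
        weights[feature] = [0.0] * len(row)
    else:
        # Disconnected feature: reconnect it with weights different from zero.
        weights[feature] = [rng.choice((-1.0, 1.0)) * rng.uniform(0.1, 1.0)
                            for _ in row]
    return weights
```

Mutations over two or three input features would call the same function on two or three distinct feature indices drawn at random.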



G. Particle swarm optimization (PSO) 

In the present paper, we define three PSO models based on the notion of neighborhood.

1) PSO based on resulting genetic offspring: First, we apply genetic operators. Each offspring that
improves our fitness function defines a neighbor, and is used in equation (1).

2) PSO based on Euclidean distance: For each particle, we compute the Euclidean distance between this
particle and the rest of the population. Next, we choose the five nearest particles based on this distance.
From the selected subset of neighbors, we choose the best one, which has the best fitness value. This
selected one defines our neighbor to be used in equation (1).

3) PSO based on the last best visited solution: In this case, each particle flies and memorizes its best
reached solution. This memory defines the neighbor to be used in equation (1).
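
The Euclidean-distance neighborhood can be sketched as below. The function name `best_neighbor` is hypothetical, the neighborhood size is exposed as a parameter `k`, and lower fitness values are assumed to be better.

```python
import math

def best_neighbor(i, positions, fitness_values, k=5):
    """Index of the fittest particle among the k nearest neighbors of particle i."""
    distances = sorted(
        (math.dist(positions[i], positions[j]), j)
        for j in range(len(positions)) if j != i
    )
    nearest = [j for _, j in distances[:k]]
    return min(nearest, key=lambda j: fitness_values[j])
```

The position of the returned particle then plays the role of the neighbor best position in equation (1).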



H. Differential evolution

In this work, we proceed as follows:

- First, for each candidate x, we generate five random solutions $x_1$, $x_2$, $x_3$, $x_4$ and $x_5$.

- Next, we apply seven chosen schemes as follows:

DE1: Scheme DE/direct/1:
$\omega = x + F (x_2 - x_1)$ (11)

DE2: Scheme DE/best/1:
$\omega = x_{best} + F (x_2 - x_1)$ (12)

DE4: Scheme DE/best/1:
$\omega = x_{best} + F (x_3 - x_1)$ (14)

DE5: Scheme DE/best/2:
$\omega = x_{best} + F (x_1 + x_2 - x_3 - x_4)$ (15)

DE6: Scheme DE/rand/2:
$\omega = x_5 + F (x_1 + x_2 - x_3 - x_4)$ (16)

DE7: with Trigonometric Mutation:
$\omega = (x_1 + x_2 + x_3)/3 + (p_2 - p_1)(x_1 - x_2) + (p_3 - p_2)(x_2 - x_3) + (p_1 - p_3)(x_3 - x_1)$ (17)
$p_i = |f(x_i) / (f(x_1) + f(x_2) + f(x_3))|$, i = 1, 2, 3 (18)

I. Stopping criterion: 



The process repeats in a cycle until it reaches a maximum number of epochs without any improvement. We fix
this maximum number of epochs at 30.
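
This stopping rule amounts to a patience counter. In the sketch below, the function name `run_until_stalled` is hypothetical and `step` stands for any callable that runs one epoch and returns the current error.

```python
def run_until_stalled(step, initial_error, patience=30):
    """Run epochs until `patience` consecutive epochs bring no improvement."""
    best = initial_error
    stalled = 0
    epochs = 0
    while stalled < patience:
        error = step()
        epochs += 1
        if error < best:
            best, stalled = error, 0   # improvement: reset the counter
        else:
            stalled += 1               # no improvement in this epoch
    return best, epochs
```

With `patience=30`, the loop ends after 30 consecutive epochs in which the best error did not decrease.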






IV. Experimental Studies 

Eleven real-world databases were selected to be used in the simulation work. They are chosen from the UCI
machine learning repository, which is commonly used to benchmark learning algorithms.

We compare the results of the proposed hybrid FLANN (HFLANN) with FLANN based on the gradient
descent algorithm. Next, a comparison with other classifiers will be done.

A. Description of The Databases

A brief description of the databases used for the experimental setup is presented in Table I. Num. is the numeric
features, Bin. is the binary ones, and Nom. is the nominal inputs, that is, discrete with three or more
distinct labels.

Table I. Summary of The Dataset Used in Simulation Studies




B. Initial population improvement: 

A population is generated randomly and its performance is presented in column
2 of Table II. Random generation gives poor results, which need some initial improvement. For this aim, we
propose to use two prior improving algorithms, back-propagation and differential evolution, as well as a
mixed back-propagation differential evolution one. From column 3 and column 4, we observe that
back-propagation improves the initial population more than differential evolution does. In the last column,
the mixed back-propagation differential evolution results are presented. Compared to the single algorithm results,
the mixed algorithm gives the best result, and it will be used in our process as a prior improving algorithm.

Table II. Summary of The Dataset Used in Simulation Studies
Columns: Random Generation; Random Generation with BP; Random Generation with DE

C. Convergence test:

In order to test the convergence of the proposed hybrid FLANN, a comparison will be done with FLANN trained
using the back-propagation algorithm. Results are presented in figure 5 and figure 6. Comparison is
done based on the required time and number of epochs for convergence.



From figure 5, we find that our process needs less than 200 seconds and 20 epochs to converge. Figure 6 presents
results for FLANN based on back-propagation. This model requires less than 150 seconds and 15 epochs to
converge.

The hybrid FLANN has a strong ability to converge fast and requires approximately the same time
as FLANN based on back-propagation.
Figure 5. The MSE hybrid FLANN results vs. time and epochs applied to the iris database (panel b: MSE vs epochs)

Figure 6. The MSE FLANN based back-propagation results vs. time and epochs applied to the iris database
(panel a: MSE vs time; panel b: MSE vs epochs)



D. Comparative study: 



Table III. Average Comparative Performance of HFLANN Based on Expansion Function
Columns: Chebyshev Hybrid FLANN with Local Search; Power Series Hybrid FLANN with Local Search;
Legendre Chebyshev Hybrid FLANN with Local Search
Table III presents results of the proposed model using four different expansion functions. We find that our
model gives better results than the FLANN based on the back-propagation algorithm.

By comparing the different expansion functions, we find that the trigonometric expansion function is the best
one, having the best mean performance (0.84608) and the smallest mean standard deviation (0.05276).
This expansion function gives the best results over 7 databases out of 11.

We can conclude that the trigonometric expansion function is the best one.






V. Conclusion 

An HFLANN was proposed based on three population based algorithms: genetic algorithms, differential
evolution and particle swarm. This classifier shows its ability to converge fast and gives better
performance than FLANN based on back-propagation.

Based on our experimentation, and compared to the other expansion functions, the trigonometric one is found
to be the best.

In future work, we can add a wrapper approach able to automatically delete irrelevant features. We can also
apply the HFLANN to other data mining problems such as prediction.

Other evolutionary algorithms can be included in the proposed process in order to obtain better results.
Other comparison criteria can be used, such as the needed speed and the robustness of the algorithm.